5 research outputs found
From usability to secure computing and back again
Secure multi-party computation (MPC) allows multiple parties
to jointly compute the output of a function while preserving
the privacy of any individual party’s inputs to that function.
As MPC protocols transition from research prototypes to real-world
applications, the usability of MPC-enabled applications
is increasingly critical to their successful deployment and
widespread adoption. Our Web-MPC platform, designed with
a focus on usability, has been deployed for privacy-preserving
data aggregation initiatives with the City of Boston and the
Greater Boston Chamber of Commerce. After building and
deploying an initial version of the platform, we conducted a
heuristic evaluation to identify usability improvements and
implemented corresponding application enhancements. However,
it is difficult to gauge the effectiveness of these changes
within the context of real-world deployments using traditional
web analytics tools without compromising the security guarantees
of the platform. This work consists of two contributions
that address this challenge: (1) the Web-MPC platform has
been extended with the capability to collect web analytics
using existing MPC protocols, and (2) as a test of this feature
and a way to inform future work, this capability has been
leveraged to conduct a usability study comparing the two versions
of Web-MPC. While many efforts have focused on ways
to enhance the usability of privacy-preserving technologies,
this study serves as a model for using a privacy-preserving
data-driven approach to evaluate and enhance the usability of
privacy-preserving websites and applications deployed in real-world
scenarios. Data collected in this study yields insights
into the relationship between usability and security; these can
help inform future implementations of MPC solutions.
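The idea of computing an aggregate without revealing individual inputs can be illustrated with additive secret sharing, a standard MPC building block. This is a minimal sketch only: the abstract does not say which protocol Web-MPC uses, and the modulus `PRIME` and the functions `share`, `reconstruct`, and `secure_sum` are hypothetical names chosen for illustration. All parties are simulated in one process.

```python
import secrets

# Assumed modulus for share arithmetic; any sufficiently large prime works here.
PRIME = 2**61 - 1

def share(value, n_parties):
    """Split value into n_parties additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]

def reconstruct(shares):
    """Recombine additive shares into the original value."""
    return sum(shares) % PRIME

def secure_sum(inputs):
    """Simulate n parties summing their inputs via additive shares."""
    n = len(inputs)
    # Party i splits its input; share j is sent to party j.
    all_shares = [share(x, n) for x in inputs]
    # Party j adds the shares it received; each share alone looks uniformly random.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME
                    for j in range(n)]
    # Only the partial sums are published, and they reveal just the total.
    return reconstruct(partial_sums)
```

Calling `secure_sum([10, 20, 30])` returns 60, while no simulated party ever holds another party's raw input. The same pattern could in principle aggregate usage counters for analytics, though how the platform does this is not described in the abstract.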
Lexicographically Fair Learning: Algorithms and Generalization
We extend the notion of minimax fairness in supervised learning problems to its natural conclusion: lexicographic minimax fairness (or lexifairness for short). Informally, given a collection of demographic groups of interest, minimax fairness asks that the error of the group with the highest error be minimized. Lexifairness goes further and asks that amongst all minimax fair solutions, the error of the group with the second highest error should be minimized, and amongst all of those solutions, the error of the group with the third highest error should be minimized, and so on. Despite its naturalness, correctly defining lexifairness is considerably more subtle than minimax fairness, because of inherent sensitivity to approximation error. We give a notion of approximate lexifairness that avoids this issue, and then derive oracle-efficient algorithms for finding approximately lexifair solutions in a very general setting. When the underlying empirical risk minimization problem absent fairness constraints is convex (as it is, for example, with linear and logistic regression), our algorithms are provably efficient even in the worst case. Finally, we show generalization bounds: approximate lexifairness on the training sample implies approximate lexifairness on the true distribution with high probability. Our ability to prove generalization bounds depends on our choosing definitions that avoid the instability of naive definitions.
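The informal definition above can be sketched as a lexicographic comparison of per-group error vectors sorted from worst to best. This illustrates only the exact (naive) ordering over a finite set of candidates, not the approximate notion or the oracle-efficient algorithms from the paper; the function names and the example numbers are hypothetical.

```python
def sorted_errors(group_errors):
    """Order per-group errors from highest to lowest."""
    return sorted(group_errors, reverse=True)

def lexifair_best(candidates):
    """Pick the candidate whose sorted error vector is lexicographically smallest.

    candidates maps a model name to its list of per-group errors.
    Python compares lists lexicographically, which matches the informal definition:
    minimize the worst error, then the second worst, and so on.
    """
    return min(candidates, key=lambda name: sorted_errors(candidates[name]))

# Hypothetical per-group errors for three candidate models.
candidates = {
    "a": [0.30, 0.10, 0.25],
    "b": [0.30, 0.20, 0.05],
    "c": [0.35, 0.01, 0.01],
}
best = lexifair_best(candidates)
```

Here models "a" and "b" tie on the worst group error (0.30), so the second highest error decides between them and "b" is selected. The reliance on an exact tie is the kind of sensitivity to approximation error that the abstract says a careful definition must avoid.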
Non-parametric differentially private confidence intervals for the median
https://arxiv.org/abs/2106.1033
Improved Differentially Private Analysis of Variance
Hypothesis testing is one of the most common types of data analysis and forms the backbone of scientific research in many disciplines. Analysis of variance (ANOVA) in particular is used to detect dependence between a categorical and a numerical variable. Here we show how one can carry out this hypothesis test under the restrictions of differential privacy. We show that the F-statistic, the optimal test statistic in the public setting, is no longer optimal in the private setting, and we develop a new test statistic F1 with much higher statistical power. We show how to rigorously compute a reference distribution for the F1 statistic and give an algorithm that outputs accurate p-values. We implement our test and experimentally optimize several parameters. We then compare our test to the only previous work on private ANOVA testing, using the same effect size as that work. We see an order of magnitude improvement, with our test requiring only 7% as much data to detect the effect.
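For reference, the classical one-way ANOVA F-statistic mentioned above can be computed as follows. This is a sketch of the standard public (non-private) statistic only; it adds no noise and is not the paper's F1 statistic, whose definition is not given in the abstract. The function name `f_statistic` is chosen for illustration.

```python
def f_statistic(groups):
    """Classical one-way ANOVA F-statistic for a list of numeric groups."""
    k = len(groups)                      # number of categories
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: variation of group means around the grand mean.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: variation of observations around their group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Ratio of mean squares, with k - 1 and n - k degrees of freedom.
    return (ssa / (k - 1)) / (sse / (n - k))
```

For the groups `[[1, 2, 3], [2, 3, 4], [5, 6, 7]]`, the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = (26/2)/(6/6) = 13. A large value indicates that the group means differ more than within-group variation would explain.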